REMARKS

Claims 1-44 remain in this application. Claim 34 has been amended.
Claims 1-44 are pending.

The Specification has been amended to reflect the current status of
referenced co-pending patent applications. Claim 34 has been amended to correct
a typographical error. No claim has been amended in response to the 35 U.S.C.
103(a) rejection.

Claims 1-44 stand rejected under 35 U.S.C. 103(a) as being obvious over
U.S. Patent No. 6,510,406, to Marchisio ("Marchisio"), in view of U.S. Patent No.
6,701,305, to Holt et al. ("Holt"). Applicant traverses the rejection. To establish
a prima facie case of obviousness: (1) there must be some suggestion or
motivation, either in the references themselves or in the knowledge generally
available to one of ordinary skill in the art, to modify the reference or combine the
reference teachings; (2) there must be a reasonable expectation of success; and (3)
the combined references must teach or suggest all the claim limitations. MPEP §
2143. A prima facie case of obviousness has not been shown.

Marchisio discloses an information retrieval system that allows a wide
margin of uncertainty in the initial choice of keywords in a query (Abstract). The
system computes a constrained measure of the similarity between a query vector
and all documents in a term-document matrix (Col. 6, lines 35-39). The system
parses electronic information files containing text, which may include recognizing
acronyms, recording word positions and extracting word roots (Col. 6, lines 35-43).
The parsing may further include generating a number of concept
identification numbers corresponding to respective terms to be associated with the
rows of a term-document matrix and the counting of individual terms in each of
the files (Col. 6, lines 47-54). The system generates the term-document matrix
based on the contents of the files parsed and the value of each cell indicates the
number of occurrences of the respective term within one of the files, or,
alternatively, the value of the cell may reflect the presence or absence of the
respective term (Col. 6, lines 55-65). The system receives a user query from a
user, consisting of a list of keywords or phrases that are parsed to generate a query
vector (Col. 5, lines 8-14; Col. 7, lines 27-39). The similarity between the query
and document projections is measured as a constrained optimization problem in
a linear transform space, which maximizes the stability of a solution at a given
level of misfit (Col. 5, lines 14-17). The system may decompose the term-document
matrix in terms of orthogonal basis functions and each basis encodes
groups of conceptually related keywords that are arranged in order of decreasing
statistical relevance to the query (Abstract).
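
By way of illustration only, and not as a reproduction of Marchisio, the two
cell conventions summarized above (occurrence counts or, alternatively, presence
or absence) can be sketched in Python. The function name
build_term_document_matrix, the whitespace tokenizer and the sample files are
hypothetical assumptions and do not appear in the reference.

```python
from collections import Counter

def build_term_document_matrix(files, binary=False):
    """Return (terms, matrix); rows are terms, columns are files.

    Hypothetical helper: whitespace tokenization stands in for the
    parsing (acronyms, word roots) that the reference describes.
    """
    counts = [Counter(text.lower().split()) for text in files]
    terms = sorted(set().union(*counts))  # one row per distinct term
    matrix = [
        # Cell is an occurrence count, or 1/0 for presence/absence.
        [(1 if c[t] else 0) if binary else c[t] for c in counts]
        for t in terms
    ]
    return terms, matrix

# Made-up sample files, for illustration only.
files = ["patent claim claim", "claim query"]
terms, counts = build_term_document_matrix(files)
_, presence = build_term_document_matrix(files, binary=True)
```

In this sketch, counts holds the number of occurrences of each term per file,
while presence holds only a 1 or 0 for each term and file.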

Holt discloses retrieving information from a text data collection that
comprises a plurality of documents, which each consist of a number of terms
(Abstract; Col. 6, lines 12-16). The text data collection is represented by a
term-by-document matrix having a plurality of entries with each entry representing the
frequency of occurrence of a term in a respective document (Col. 6, lines 16-19).
An orthogonal basis for a lower dimensional subspace is generally obtained from
the term-by-document matrix as part of document indexing (Col. 6, lines 19-22).
A representation of at least a portion of the original matrix is projected into a
lower dimensional subspace and those portions of the subspace representation that
relate to terms of a query are weighted (Col. 5, lines 62-67). A plurality of
documents are scored with respect to the query based at least partially upon the
weighted portion of the subspace representation (Col. 6, lines 31-33). Documents
can then be identified based upon ranking the scores of the documents with
respect to the query (Col. 6, lines 34-36).

In contrast, Claim 1 defines a system for analyzing unstructured
documents for conceptual relationships. Claim 1 recites a histogram module
determining a frequency of occurrences of concepts in a set of unstructured
documents, each concept representing an element occurring in one or more of the
unstructured documents. Claim 1 further recites a selection module selecting a
subset of concepts out of the frequency of occurrences, grouping one or more
concepts from the concepts subset, and assigning weights to one or more clusters
of concepts for each group of concepts. Claim 1 further recites a best fit module
calculating a best fit approximation for each document indexed by each such
group of concepts between the frequency of occurrences and the weighted cluster
for each such concept grouped into the group of concepts. Such claim is neither
taught nor suggested by Marchisio and Holt.

In contrast, Claim 9 defines a method for analyzing unstructured
documents for conceptual relationships. Claim 9 recites determining a frequency
of occurrences of concepts in a set of unstructured documents, each concept
representing an element occurring in one or more of the unstructured documents.
Claim 9 further recites selecting a subset of concepts out of the frequency of
occurrences. Claim 9 further recites grouping one or more concepts from the
concepts subset. Claim 9 further recites assigning weights to one or more clusters
of concepts for each group of concepts. Claim 9 further recites calculating a best
fit approximation for each document indexed by each such group of concepts
between the frequency of occurrences and the weighted cluster for each such
concept grouped into the group of concepts. Such claim is neither taught nor
suggested by Marchisio and Holt.

In contrast, Claim 18 defines a system for dynamically evaluating latent
concepts in unstructured documents. Claim 18 recites an extraction module
extracting a multiplicity of concepts from a set of unstructured documents into a
lexicon uniquely identifying each concept and a frequency of occurrence. Claim
18 further recites a frequency mapping module creating a frequency of occurrence
representation for each documents set, the representation providing an ordered
corpus of the frequencies of occurrence of each concept. Claim 18 further recites
a concept selection module selecting a subset of concepts from the frequency of
occurrence representation filtered against a minimal set of concepts each
referenced in at least two documents with no document in the corpus being
unreferenced. Claim 18 further recites a group generation module generating a
group of weighted clusters of concepts selected from the concepts subset. Claim
18 further recites a best fit module determining a matrix of best fit approximations
for each document weighted against each group of weighted clusters of concepts.
Such claim is neither taught nor suggested by Marchisio and Holt.



In contrast, Claim 31 defines a method for dynamically evaluating latent
concepts in unstructured documents. Claim 31 recites extracting a multiplicity of
concepts from a set of unstructured documents into a lexicon uniquely identifying
each concept and a frequency of occurrence. Claim 31 further recites creating a
frequency of occurrence representation for each documents set, the representation
providing an ordered corpus of the frequencies of occurrence of each concept.
Claim 31 further recites selecting a subset of concepts from the frequency of
occurrence representation filtered against a minimal set of concepts each
referenced in at least two documents with no document in the corpus being
unreferenced. Claim 31 further recites generating a group of weighted clusters of
concepts selected from the concepts subset. Claim 31 further recites determining
a matrix of best fit approximations for each document weighted against each
group of weighted clusters of concepts. Such claim is neither taught nor
suggested by Marchisio and Holt.

First, there must be some suggestion or motivation, either in the references
themselves or in the knowledge generally available to one of ordinary skill in the
art, to modify the reference or combine the reference teachings and there must be
a reasonable expectation of success. The teaching or suggestion to make the
claimed combination and the reasonable expectation of success must both be
found in the prior art, and not based on applicant's disclosure. MPEP § 2143
(citing In re Vaeck, 947 F.2d 488 (Fed. Cir. 1991)).

Marchisio fails to provide a suggestion or motivation to modify or
combine with the reference teachings of Holt. Marchisio teaches formulating a
constrained optimization problem in a linear transform space based on a
term-spread matrix, error-covariance matrix and the user query vector, with the
term-spread matrix corresponding to a weighted autocorrelation of the term-document
matrix and indicating an amount of variation in term usage in the information files
and extent to which the terms are correlated (Col. 22, lines 14-44). In contrast,
Holt teaches utilizing multidimensional subspaces to represent semantic
relationships that exist in a set of documents, wherein the documents are
represented using a subspace transformation based on the distribution of the
occurrence of terms in the documents (Col. 1, lines 19-23). In particular, a
term-by-document frequency matrix is initially constructed that catalogs the
frequencies of the various terms for each of the documents, which is then
preprocessed to define a working matrix by normalizing the columns of the
term-by-document matrix to have a unit sum, stabilizing the variance of the term
frequencies via a non-linear function and centering the term frequencies with
respect to the mean vector of the columns (Col. 3, lines 16-28). Holt teaches
away from the use of traditional vector space methods, such as used by Marchisio
(see, Holt, Col. 2, lines 30-57), in favor of subspace transformations, by
distinguishing the vector space methods as having performance severely limited
by the size of the document collection, particularly with respect to recomputation
of term weighting factors (Col. 2, line 64 through Col. 3, line 9; Col. 5, lines 30-40).
As a result, one of ordinary skill in the art would not find a suggestion or
motivation to combine the teachings of Marchisio with the teachings of Holt.
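
For reference, the three preprocessing steps attributed to Holt above (unit-sum
column normalization, variance stabilization via a non-linear function, and
centering on the mean column vector) can be sketched in Python. This is a
minimal sketch under stated assumptions: the function name preprocess is
hypothetical, and the square root is an assumed stand-in, because the passage
summarized above does not identify the non-linear function.

```python
import math

def preprocess(freq):
    """freq[i][j] = frequency of term i in document j (rows are terms)."""
    n_terms, n_docs = len(freq), len(freq[0])
    # 1. Normalize each column (document) to have a unit sum.
    col_sums = [sum(freq[i][j] for i in range(n_terms)) for j in range(n_docs)]
    work = [[freq[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n_docs)] for i in range(n_terms)]
    # 2. Stabilize the variance with a non-linear function.
    #    ASSUMPTION: square root; the source does not name the function.
    work = [[math.sqrt(v) for v in row] for row in work]
    # 3. Center each term on its mean over the documents, that is,
    #    subtract the mean column vector from every column.
    means = [sum(row) / n_docs for row in work]
    return [[v - m for v in row] for row, m in zip(work, means)]

# Made-up frequencies for two terms in two documents.
working = preprocess([[1, 3], [3, 1]])
```

After centering, each row of the working matrix sums to zero up to
floating-point rounding.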

Similarly, one of ordinary skill in the art would not have a reasonable
expectation of success in combining the teachings of Marchisio and Holt. The
subspace transformation taught by Holt assigns weights to terms in a user query
(Col. 5, lines 26-27), whereas Marchisio teaches an opposite approach of
assigning weights to the elements of the term-document matrix (Col. 7, lines 10-17).
Moreover, the traditional term weighting approach taught by Marchisio is
rejected by Holt as requiring relatively time-consuming and processing-intensive
recomputation upon the addition of new documents or removal of old documents
from the document collection, as well as being unsuitable for some applications,
such as the assignment of topic words, that is, words automatically generated to
summarize a document (see, Holt, Col. 5, lines 32-46). Thus, combining the
teachings of Marchisio and Holt would not result in a successful combination.

Finally, the combined references of Marchisio and Holt fail to teach or
suggest all claim limitations. Preliminarily, Claims 1-17 define systems and
methods for analyzing unstructured documents for conceptual relationships.
Claims 18-44 define systems and methods for dynamically evaluating latent
concepts in unstructured documents. Neither set of claims recites a user query
that is analyzed against a collection of documents, as taught by the combination of
Marchisio and Holt.

More particularly, Marchisio teaches adding a new row to the term-document
matrix for each phrase in the user query, where each cell in the new
row contains the frequency of occurrence of the phrase within the respective
electronic information file, as determined by the frequencies of occurrence of
individual terms composing the phrase and the proximity of such concepts, as
determined by their relative positions in the electronic information files, as
indicated by the elements of the auxiliary data structure. Thus, the term-document
matrix grows, row-by-row, based on the phrases occurring in the user
query, whereas the frequency of occurrences of concepts recited by Claims 1, 9,
18, and 31 is determined for the set of documents without reference to user
query phrases, as taught by Marchisio. Nor does Marchisio suggest adding new
term-document matrix rows independently from user query phrases.

Marchisio further teaches an auxiliary data structure that permits
reforming of the term-document matrix to include rows corresponding to phrases
in the user query for the purposes of processing that query. Thus, the rows in the
auxiliary data structure are also dependent upon the phrases occurring in the user
query, whereas the subsets of concepts recited by Claims 1, 9, 18, and 31 are
selected out of the frequency of occurrences without reference to user query
phrases, as taught by Marchisio. Nor does Marchisio suggest including the
term-document matrix rows independently from user query phrases.

Marchisio further teaches assigning weights to the elements of the
term-document matrix, whereas the weights recited by Claims 1, 9, 18, and 31 are
assigned to one or more clusters of concepts for each group of concepts and not to
a matrix, as taught by Marchisio. Nor does Marchisio suggest assigning weights
to specifically formed clusters of concepts.

Finally, Holt teaches evaluating a score vector to determine the relative
performance of the documents against the user query. The documents to return to
a user are selected in a variety of methods, typically by returning the best scoring
documents identified, for example, by applying a threshold to the individual
scores, by taking a fixed number in ranked order, or by statistical or clustering
techniques applied to the vectors of the scores. In contrast, the best fit
approximation recited by Claims 1, 9, 18, and 31 is calculated for each document
indexed by each such group of concepts between the frequency of occurrences
and the weighted cluster for each such concept grouped into the group of concepts
and without reference to a user query, as taught by Holt. Nor does Holt suggest
scoring documents independently from the user query. Thus, the combined
references of Marchisio and Holt fail to teach or suggest all claim limitations.

Thus, a prima facie case of obviousness has not been shown with respect
to Claims 1, 9, 18, and 31. Claims 2-8 are dependent on Claim 1 and are
patentable for the above-stated reasons, and as further distinguished by the
limitations recited therein. Claims 10-17 are dependent on Claim 9 and are
patentable for the above-stated reasons, and as further distinguished by the
limitations recited therein. Claims 19-30 are dependent on Claim 18 and are
patentable for the above-stated reasons, and as further distinguished by the
limitations recited therein. Claims 32-44 are dependent on Claim 31 and are
patentable for the above-stated reasons, and as further distinguished by the
limitations recited therein.

The prior art made of record and not relied upon has been reviewed by the
applicant and is considered to be no more pertinent than the prior art references
already applied.

Claims 1-44 are believed to be in a condition for allowance. Entry of the
foregoing amendments is requested and a Notice of Allowance is earnestly
solicited. Please contact the undersigned at (206) 381-3900 regarding any
questions or concerns associated with the present matter.

Respectfully submitted, 






Dated: August 27, 2004 







Patrick J.S. Inouye, Esq.
Reg. No. 40,297



Law Offices of Patrick Inouye 

810 Third Avenue, Suite 258 Telephone: (206) 381-3900 

Seattle, WA 98104 Facsimile: (206) 381-3999